Using Mahalanobis distance to compare genomic signatures between bacterial plasmids and chromosomes
Authors
Abstract
Plasmids are ubiquitous mobile elements that serve as a pool of many host-beneficial traits, such as antibiotic resistance, in bacterial communities. To understand the importance of plasmids in horizontal gene transfer, we need to gain insight into the 'evolutionary history' of these plasmids, i.e., the range of hosts in which they have evolved. Since extensive data support the proposal that foreign DNA acquires the host's nucleotide composition during long-term residence, comparison of the nucleotide composition of plasmids and chromosomes could shed light on a plasmid's evolutionary history. The average absolute dinucleotide relative abundance difference, termed delta-distance, has been commonly used to measure differences in dinucleotide composition, or 'genomic signature', between bacterial chromosomes and plasmids. Here, we introduce the Mahalanobis distance, which takes into account the variance-covariance structure of the chromosome signatures. We demonstrate that the Mahalanobis distance is better than the delta-distance at measuring genomic signature differences between plasmids and chromosomes of potential hosts. We illustrate the usefulness of this metric for proposing candidate long-term hosts for plasmids, focusing on the virulence plasmids pXO1 from Bacillus anthracis and pO157 from Escherichia coli O157:H7, as well as the broad-host-range multidrug-resistance plasmid pB10 from an unknown host.
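The two distances named in the abstract can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not give the windowing scheme, strand handling, or covariance estimator, so the function names (`signature`, `delta_distance`, `mahalanobis`), the use of single-strand counts, and the pseudo-inverse of the covariance matrix are all choices made here for illustration. The signature is taken as the usual dinucleotide relative abundance, the observed dinucleotide frequency divided by the product of its two mononucleotide frequencies.

```python
from itertools import product

import numpy as np

# The 16 dinucleotides in a fixed order: AA, AC, AG, AT, CA, ...
DINUCS = ["".join(p) for p in product("ACGT", repeat=2)]


def signature(seq):
    """Dinucleotide relative abundances rho_xy = f_xy / (f_x * f_y).

    Assumption: counts come from the given strand only, and every base
    A, C, G, T occurs at least once (otherwise a division by zero occurs).
    """
    seq = seq.upper()
    mono = {b: seq.count(b) for b in "ACGT"}
    n_mono = sum(mono.values())
    di = {d: 0 for d in DINUCS}
    for i in range(len(seq) - 1):
        pair = seq[i:i + 2]
        if pair in di:  # skip pairs containing ambiguous bases such as N
            di[pair] += 1
    n_di = sum(di.values())
    return np.array([
        (di[d] / n_di) / ((mono[d[0]] / n_mono) * (mono[d[1]] / n_mono))
        for d in DINUCS
    ])


def delta_distance(sig_a, sig_b):
    """Average absolute difference between two signatures."""
    return float(np.mean(np.abs(sig_a - sig_b)))


def mahalanobis(sig, reference_sigs):
    """Mahalanobis distance from `sig` to a set of reference signatures.

    `reference_sigs` has one row per reference segment (for example,
    windows of one chromosome). Assumption: the pseudo-inverse is used so
    that a singular covariance matrix does not raise an error.
    """
    mean = reference_sigs.mean(axis=0)
    cov = np.cov(reference_sigs, rowvar=False)
    diff = sig - mean
    squared = diff @ np.linalg.pinv(cov) @ diff
    return float(np.sqrt(max(squared, 0.0)))  # guard against tiny negatives
```

In use, one would compute `signature` for a plasmid and for many segments of a candidate host chromosome, then call `mahalanobis(plasmid_sig, chromosome_segment_sigs)`. Unlike `delta_distance`, which weights all 16 dinucleotides equally, the Mahalanobis form scales each difference by how much that dinucleotide varies, and co-varies with the others, across the chromosome segments.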
Similar resources
A genome-wide scan to detect signatures of recent selection in Australian Merino sheep
Domestication and selection are processes that conserve the pattern of genetic diversities between and within populations. Identification of genomic regions that are targets of selection for phenotypic traits is one of the main aims of research in animal genetics. An approach for identifying divergently selected regions of the genome is to compare FST values among loci to estimate the genetic v...
An Evaluation of Mahalanobis-Taguchi System and Neural Network for Multivariate Pattern Recognition
The Mahalanobis-Taguchi System is a diagnosis and predictive method for analyzing patterns in multivariate cases. The goal of this study is to compare the ability of the Mahalanobis- Taguchi System and a neural-network to discriminate using small data sets. We examine the discriminant ability as a function of data set size using an application area where reliable data is publicly available. The...
Identifying Useful Variables for Vehicle Braking Using the Adjoint Matrix Approach to the Mahalanobis-Taguchi System
The Mahalanobis Taguchi System (MTS) is a diagnosis and forecasting method for multivariate data. Mahalanobis distance (MD) is a measure based on correlations between the variables and different patterns that can be identified and analyzed with respect to a base or reference group. MTS is of interest because of its reported accuracy in forecasting small, correlated data sets. This is the type o...
Applying the Mahalanobis-Taguchi System to Vehicle Ride
The Mahalanobis Taguchi System is a diagnosis and forecasting method for multivariate data. Mahalanobis distance is a measure based on correlations between the variables and different patterns that can be identified and analyzed with respect to a base or reference group. The Mahalanobis Taguchi System is of interest because of its reported accuracy in forecasting small, correlated data sets. Th...
Genome DNA Sequence Variation, Evolution, and Function in Bacteria and Archaea.
Comparative genomics has revealed that variations in bacterial and archaeal genome DNA sequences cannot be explained by only neutral mutations. Virus resistance and plasmid distribution systems have resulted in changes in bacterial and archaeal genome sequences during evolution. The restriction-modification system, a virus resistance system, leads to avoidance of palindromic DNA sequences in ge...